Modular organization enhances the robustness of attractor network dynamics 
Modular organization characterizes many complex networks occurring in nature, including the
brain. In this paper we show that modular structure may be responsible for increasing the robust- 
ness of certain dynamical states of such systems. In a neural network model with threshold-activated 
binary elements, we observe that the basins of attractors, corresponding to patterns that have been 
embedded using a learning rule, occupy maximum volume in phase space at an optimal modularity. 
Simultaneously, the convergence time to these attractors decreases as a result of cooperative dynam- 
ics between the modules. The role of modularity in increasing global stability of certain desirable 
attractors of a system may provide a clue to its evolution and ubiquity in natural systems. 
A ubiquitous property of complex systems is their
modular organization [1], characterized by communities
of densely connected elements with sparser connections 
between the different communities [2]. In the biologi- 
cal world, modules are seen to occur across many length 
scales, from the intra-cellular networks of protein-protein 
interactions [3, 4] and signaling pathways [5] to food
webs comprising multiple species populations [6]. Although
such groupings are primarily defined in terms of
the structural features of the network topology, in several 
instances distinct modules have also been associated with 
specific functions. Indeed, in the case of the brain, mod- 
ular organization at the anatomical level has long been 
thought to be paralleled at the functional level of cognition [7].
By observing the effects of isolating or disconnecting
different brain areas on the behavior of subjects,
the functional specialization of spatially distinct modules
has been established at different length scales [8], from
hemispheric specialization to minicolumns comprising a
few hundred cells, which have been proposed as the basic
information processing units of the cerebral cortex [9, 10].
More recently, the analysis of neurobiological data using 
graph theoretic techniques [11] has further established
the modular nature of inter-connections between differ- 
ent areas of the mammalian cortex. The structural mod- 
ules revealed by tracing the anatomical connections in 
mammalian brains [12, 13] are complemented by the observation
of functionally defined networks having modular
character [14, 15]. Such functional networks have
been reconstructed from MRI and fMRI experiments on
both human [16] and non-human [17] subjects, by considering
two brain areas to be connected if they are simultaneously
active when the subject performs a specific
behavioral task.

The widespread occurrence of modularity prompts the
question as to why this structural organization is so ubiquitous [18].
One possible reason is that it enhances communication
efficiency by decreasing the average network
path length while allowing high clustering to help localize
signals within subnetworks [19]. However, of more interest
is the possibility that modularity may play a crucial
role in the principal function of the system, viz., infor- 
mation processing in the case of brain networks. This 
possibility has been investigated in detail for the somatic 
nervous system of the nematode C. elegans [20]. It is
therefore intriguing to speculate whether modularity is 
responsible for efficient information processing in brains 
of more evolved organisms, the mammalian cortex in par- 
ticular. To explore this idea further we can study the 
effect of modular structure on the dynamics of attractor 
network models with threshold-activated nodes, which 
exhibit multiple stable states or "memories" [21, 22].



These models were originally developed to understand 
how the nervous system communicates among its compo- 
nent parts and learns associations between different stim- 
uli so that a memorized pattern can be retrieved in its 
entirety from a small part or a noise-corrupted version of 
it given as input ("associative memory"). Indeed, recent 
experiments indicate that the spatiotemporal activation 
dynamics in neocortical networks converge to one of sev- 
eral different persistent, stable patterns which resemble 
the behavior observed in such models [23]. However, the
properties of attractor networks are of more general interest
and have been used to understand systems outside
the domain of neurobiology, for example, the network
involved in intracellular signaling, where communication
between molecules within a cell takes place through multiple
interacting pathways [3, [lH . In the attractor networks,
desired patterns are stored by using a learning rule
to determine the connection weights between the nodes. 
This ensures that the update (or recall) dynamics of the 
network makes it converge to these pre-specified dynam- 
ical states when an input initial state of the system is 
transformed into an output state defined over the same 
set of nodes by the collective dynamics of the network. 
Using such simplified models has the advantage of making
the observed phenomena simpler to analyze and of yielding
results that are independent of the specific biological
details of different types of neurons and synaptic
connections.

In this paper we show that if we want to store p (say) 
patterns in a network with a given number of nodes and 
links, then the convergence to an attractor corresponding 
to any of the stored patterns (i.e., recall) will be most ef- 
ficient when the network has an optimal modular struc- 
ture, provided the number of patterns is not too large 
($p < p_{\max}$). If the degree of modularity is increased or
decreased from the optimum value, the reliability with 
which the patterns are recalled decreases. This optimal 
efficiency of recall originates from the network dynamics 
itself. Some of the modules converge quickly to attrac- 
tors corresponding to parts of stored patterns and then 
help other modules to reach the attractor correspond- 
ing to the entire stored pattern through interactions via 
intermodular links. If the modularity is increased (i.e., 
if the number of intra-modular links is increased while 
reducing the number of inter-modular links to keep the 
average degree fixed), the modules cannot interact with 
each other strongly enough because of the smaller number of
inter-modular links, and the performance of the network is
less efficient. On the other hand, if the modularity is 
decreased, the modules themselves become sparsely con- 
nected and cannot reach an attractor rapidly. Also, if 
we try to store a larger number of patterns ($p > p_{\max}$),
the advantage of modularity disappears because of the 
generation of a large number of spin-glass states which 
correspond to spurious patterns. 

The attractor network model we have used to investigate
the role of modularity is constructed such that the
$N$ nodes comprising it are divided into $n_m$ modules, each
having $n$ ($= N/n_m$) nodes [3]. The connection probability
between a pair of nodes belonging to the same module
is $\rho_i$, while that between nodes belonging to different
modules is $\rho_o$. The modular nature of the network can
be varied continuously by altering the ratio of inter- to
intra-modular connectivity, $r = \rho_o/\rho_i \in [0, 1]$, keeping the
average degree $\langle k \rangle$ fixed (Fig. 1). For $r = 0$, the network
is fragmented into $n_m$ isolated clusters, whereas at
$r = 1$, it is a homogeneous or Erdős-Rényi random network.
We ensure that the resulting adjacency matrix $A$
(i.e., $A_{ij} = 1$ if $i, j$ are connected, and 0 otherwise) is
symmetric. We have explicitly verified that the results
reported below do not change appreciably if $A$ is non-symmetric
(corresponding to a directed network).
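
As an illustration of the construction just described, the following minimal sketch generates a symmetric random modular adjacency matrix for given $N$, $n_m$, $\langle k \rangle$ and $r$. The function name `modular_adjacency` is hypothetical, and the relation used to fix the intra-modular connection probability, $\langle k \rangle = \rho_i (n-1) + \rho_o (N-n)$ with $\rho_o = r \rho_i$, is an assumption derived here from the stated definitions rather than a formula quoted from the text.

```python
import numpy as np

def modular_adjacency(N, n_m, k_avg, r, rng):
    """Symmetric 0/1 adjacency matrix of a random modular network (sketch)."""
    n = N // n_m                                   # nodes per module (N divisible by n_m)
    # Assumed closure: <k> = rho_i*(n-1) + rho_o*(N-n), with rho_o = r*rho_i.
    rho_i = k_avg / ((n - 1) + r * (N - n))
    rho_o = r * rho_i
    if rho_i > 1.0:
        raise ValueError("average degree too large for this module size")
    module = np.repeat(np.arange(n_m), n)          # module label of each node
    same = module[:, None] == module[None, :]      # True for intra-modular pairs
    prob = np.where(same, rho_i, rho_o)
    upper = np.triu(rng.random((N, N)) < prob, k=1)  # draw each pair once
    A = (upper | upper.T).astype(np.int8)          # symmetrize; no self-links
    return A, module

rng = np.random.default_rng(0)
A, module = modular_adjacency(N=256, n_m=4, k_avg=60, r=0.1, rng=rng)
print("mean degree:", A.sum(axis=1).mean())
```

Setting `r=0.0` yields isolated clusters and `r=1.0` a homogeneous random network, the two limits mentioned above.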

The time-evolution of the system is governed by the
dynamics of the variables associated with each node of
the network. An Ising spin $\sigma_i = \pm 1$ is placed at each node,
which may represent any binary state variable, such as
a two-state neuron (firing $= 1$, inactive $= -1$). The states
of the spins are evaluated at discrete time-steps using
random sequential updating according to the following
deterministic (or zero temperature) dynamics:

$$\sigma_i(t+1) = \mathrm{sign}\Big(\sum_j A_{ij} W_{ij}\, \sigma_j(t)\Big), \qquad (1)$$

FIG. 1: (a-c) Adjacency matrices $A$ defining the network connections
at different values of the modularity parameter $r$ for
$N = 256$ nodes arranged into $n_m = 4$ modules (average degree
$\langle k \rangle = 60$). Starting from a system of isolated clusters
(a, $r = 0$), by increasing $r$ we obtain modular networks (b,
$r = 0.1$), eventually arriving at a homogeneous network (c,
$r = 1$). The connection structure of modular networks in the
intermediate range $0 < r < 1$ is shown schematically in (d).
The connection weights have different magnitudes and signs.

where $W_{ij}$ is the connection strength between neurons
$i$ and $j$. The function $\mathrm{sign}(z) = 1$ if $z > 0$, $= -1$ if
$z < 0$, and is randomly chosen to be $\pm 1$ if $z = 0$. The
weight associated with each link is evaluated using the
Hebbian learning rule [22] for storing $p$ random patterns
in an associative network:

$$W_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \xi_i^{\mu} \xi_j^{\mu}, \qquad W_{ii} = 0, \qquad (2)$$

$\xi_i^{\mu}$ being the $i$-th component of the $\mu$-th pattern vector
($\mu = 1, \ldots, p$). Each of the stored patterns is generated
randomly by choosing each component to be $+1$ or
$-1$ with equal probability. Starting from an arbitrary
initial state, the network eventually converges to a time-invariant
stable state or attractor. The overlap of an attractor
of the network dynamics $S^* = \{\sigma_i^*\}$ with any of
the stored patterns can be measured as $m_\mu = \frac{1}{N} \sum_i \sigma_i^* \xi_i^\mu$.
As we are interested in the set of all the attractors of
stored patterns rather than one specific pattern, we focus
our attention on the maximum overlap with the
stored patterns, $m = \max_\mu |m_\mu|$. To examine the global
stability of the attractors corresponding to the stored
patterns, we use random strings as the initial state of
the network, which should have almost no overlap with
any of the stored patterns, on average. The probability
$v_g = \langle \mathrm{Prob}(m > m_0) \rangle$ that such a random initial
state eventually converges to, or very close to, one of the stored
patterns gives an estimate of the overall volume that
the basins of attraction of stored patterns occupy in the
$N$-dimensional network configuration space $\{S\}$. Here
$m_0$ is a threshold for the overlap of the asymptotic stable
state above which the network can be considered to
have recalled a pattern successfully and $\langle \ldots \rangle$ indicates
averaging over many different network configurations $A$,

FIG. 2: Fractional volume of phase space occupied by the
basins of attraction of the stored patterns in a single module
($v_m$) and the entire random modular network ($v_g$). Note
that, when the number of stored patterns is within a critical
range ($p_{\min} = 2 < p < p_{\max} = 9$), these quantities
show a non-monotonic variation with $r$, having a peak around
$r_c \simeq 1/(n_m - 1) \simeq 0.14$. Results are shown for $N = 1024$,
$n_m = 8$ and $\langle k \rangle = 120$. Different numbers of stored patterns
$p$ are indicated using various symbols.



as well as pattern ensembles $\{\xi\}$ and initial states. The
value of the threshold $m_0$ has been taken to be 0.95 for
most of the analysis presented here; we have verified that
varying it over a small range does not alter our results.
In a similar way, we can define the overlap for each module,
$m_\mu(\alpha) = \frac{1}{n} \sum_i \sigma_i^*(\alpha)\, \xi_i^\mu(\alpha)$, where the sum is over all spins
in the $\alpha$-th module, with $\alpha = 1, \ldots, n_m$ being an index
running over the different modules. The relative size of
the basins of attraction at the modular scale is characterized
by the quantity $v_m = \langle \langle \mathrm{Prob}(m(\alpha) > m_0) \rangle_\alpha \rangle$, where
$m(\alpha) = \max_\mu |m_\mu(\alpha)|$ and $\langle \ldots \rangle_\alpha$ indicates averaging over
all the modules.
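
The recall procedure described above (Hebbian weights, zero-temperature sequential updates, and the overlap-based estimate of the basin volume) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, one sweep is taken to be $N$ single-spin updates in a random order, the $1/N$ prefactor of the weights is assumed (it does not affect the sign dynamics), and the average is taken only over initial states for a single stand-in homogeneous network and pattern set, whereas the text also averages over networks and pattern ensembles.

```python
import numpy as np

def hebbian_weights(patterns):
    """Hebbian rule: W_ij = (1/N) * sum_mu xi_i^mu xi_j^mu, with W_ii = 0."""
    p, N = patterns.shape
    W = patterns.T @ patterns / N
    np.fill_diagonal(W, 0.0)
    return W

def relax(A, W, state, rng, max_sweeps=200):
    """Zero-temperature sequential updates until a full sweep changes nothing."""
    J = A * W                                  # weights only on existing links
    s = state.copy()
    for sweep in range(1, max_sweeps + 1):
        flipped = False
        for i in rng.permutation(s.size):      # assumed: random order, each spin once
            h = J[i] @ s
            new = int(np.sign(h)) if h != 0 else int(rng.choice((-1, 1)))
            if new != s[i]:
                s[i] = new
                flipped = True
        if not flipped:
            break
    return s, sweep

def max_overlap(s, patterns):
    """m = max_mu |m_mu| with m_mu = (1/N) sum_i s_i xi_i^mu."""
    return np.abs(patterns @ s).max() / s.size

def basin_fraction(A, patterns, rng, trials=20, m0=0.95):
    """Fraction of random initial states that end with overlap m > m0."""
    W = hebbian_weights(patterns)
    hits = 0
    for _ in range(trials):
        s0 = rng.choice((-1, 1), size=A.shape[0])
        s, _ = relax(A, W, s0, rng)
        hits += int(max_overlap(s, patterns) > m0)
    return hits / trials

rng = np.random.default_rng(1)
N, p = 200, 3
upper = np.triu(rng.random((N, N)) < 0.5, k=1)
A = (upper | upper.T).astype(int)              # stand-in homogeneous network
patterns = rng.choice((-1, 1), size=(p, N))
print("estimated v_g:", basin_fraction(A, patterns, rng))
```

The modular quantity $v_m$ could be estimated in the same way by restricting `max_overlap` to the spins of one module at a time.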

We first look at how the total volume of the configuration
space occupied by the basins of attraction for stored
patterns $\xi^\mu$ changes as the modular character of the network
is altered by varying $r$ for a fixed $\langle k \rangle$. Fig. 2 shows
the combined fractional volume of the phase space occupied
by the basins of attraction of the stored patterns for
the entire network ($v_g$) as well as for the corresponding
sub-patterns in a single module ($v_m$). Different curves indicate
different numbers of stored patterns $p$. We immediately
notice that while $v_m$ has finite values over the entire
range of $r$, $v_g$ is zero at low values of $r$, where a module
is connected to the rest of the network by very few links,
if at all. The value of $r$ at which $v_g$ starts rising from
0 appears to be independent of the number of stored patterns
$p$. Below this value of $r$, the connectivity between
the modules is insufficient to recall the entire stored pattern,
even though individual modules may have complete
overlap with different stored patterns. To explain the situation,
we can decompose each stored pattern in terms of
$n_m$ sub-patterns defined over the different modules, viz.,
$\xi^\mu = \{\xi^\mu(\alpha)\}$, where $\alpha = 1, \ldots, n_m$. Starting from a
random initial state, a module $\alpha$ may converge to an attractor
corresponding to any of the $n_m$ different sub-patterns.
As the recall dynamics within each module
is nearly independent of the other modules for low $r$, the modules
may each converge to sub-parts of different patterns, i.e.,
the value of $\mu$ would not be identical for the attractors of
all the $n_m$ modules. Thus, the resulting attractor for the
entire network corresponds to a "chimera" memory state,
$\{\xi^{\mu_1}(1), \ldots, \xi^{\mu_{n_m}}(n_m)\}$, i.e., a spurious pattern comprising
fragments of different stored patterns [26, 27].
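
To make the notion of a chimera state concrete, the sketch below assembles one by hand from random patterns and compares its overlap at the modular and global scales. It is only an illustration: the assignment of pattern $\mu_\alpha = \alpha \bmod p$ to module $\alpha$ is an arbitrary choice made here, and no network dynamics is simulated.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n_m, p = 1024, 8, 4
n = N // n_m
patterns = rng.choice((-1, 1), size=(p, N))
module = np.repeat(np.arange(n_m), n)          # module label of each node

# Chimera: module a copies the sub-pattern of stored pattern (a mod p).
chimera = np.empty(N, dtype=int)
for a in range(n_m):
    idx = module == a
    chimera[idx] = patterns[a % p, idx]

# Maximum overlap with the stored patterns, globally and within each module.
global_overlap = np.abs(patterns @ chimera).max() / N
module_overlap = [
    np.abs(patterns[:, module == a] @ chimera[module == a]).max() / n
    for a in range(n_m)
]
print("global m:", global_overlap)
print("module m:", module_overlap)
```

Every module has complete overlap with one stored sub-pattern, while the global overlap stays far below the recall threshold of 0.95, which is why such states contribute to $v_m$ but not to $v_g$.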

From the perspective of enhanced robustness of the
dynamical attractors of the entire network, even more
interesting is the behavior of $v_g$ and $v_m$ when $r$ is increased
further after the modules have become appreciably
inter-connected. Over an intermediate range
$p_{\min} < p < p_{\max}$, we notice a non-monotonic variation of
both $v_g$ and $v_m$ with respect to $r$. Fig. 2 shows that both
curves attain a maximum around $r_c \simeq 1/(n_m - 1) \simeq 0.14$,
where a neuron has the same number of connections with
nodes belonging to its own module as it has with neurons
belonging to different modules. When the relative number
of inter-modular connections is increased beyond $r_c$,
the fractional volume of configuration space occupied by
the attractors corresponding to the stored patterns tends
to decrease. This implies that the homogeneous network
($r = 1$) is actually less robust than its modular counterpart
($r \simeq r_c$) in terms of global stability of the stored
attractors. As $p$ increases beyond $p_{\max}$, both $v_g$ and
$v_m$ decrease at the resulting high loading fraction $p/\langle k \rangle$
through the generation of a large number of spin-glass
states [23|. We have explicitly verified that the maximum
number of stored patterns $p_{\max}$ beyond which the
non-monotonic nature of the variation is lost increases
when the total number of neurons $N$ is increased, keeping
the overall density of connections, $\langle k \rangle/(N - 1)$, and
the number of modules, $n_m$, fixed pjj .
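
The quoted location of the optimum can be recovered from the condition that a node has, on average, as many intra-modular as inter-modular links. The short derivation below is a sketch under the assumption that the intra- and inter-modular connection probabilities (written here as $\rho_i$ and $\rho_o = r \rho_i$) apply independently to every pair of nodes, with $n = N/n_m$ nodes per module.

```latex
\[
  k_{\mathrm{intra}} = \rho_i\,(n-1), \qquad
  k_{\mathrm{inter}} = \rho_o\,(N-n) = r\,\rho_i\,(N-n),
\]
\[
  k_{\mathrm{intra}} = k_{\mathrm{inter}}
  \;\Longrightarrow\;
  r_c = \frac{n-1}{N-n} \simeq \frac{n}{N-n} = \frac{1}{n_m - 1}.
\]
```

For $N = 1024$ and $n_m = 8$ this gives $r_c = 127/896 \approx 0.142$, in agreement with the value $1/(n_m - 1) = 1/7 \approx 0.143$.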

FIG. 3: (a) Distribution of the overlap of the attractors of
the network dynamics with the stored patterns in a random
modular network, at different values of the modularity parameter
$r$. $P(m)$ is the probability of having overlap $m$. Complete
overlap with the stored patterns ($m = 1$) becomes more
probable as $r$ becomes larger than a threshold value. However,
at large values of $r$, there is a secondary peak around
$m \simeq 0.5$ corresponding to mixed states (i.e., linear combinations
of an odd number of stored patterns). This peak shows a
dip at $r_c \simeq 1/(n_m - 1) \simeq 0.14$. (b) The variation, as a function
of $r$, of the fraction of the total number of spurious attractors
that are mixed states, $f_{\mathrm{mix}}$. For $r > r_c$, the mixed states account
for almost all the attractors not corresponding to any of
the stored patterns. They can be either combinations having
the same sign (square) or different signs (diamond). Results
shown for $N = 1024$, $n_m = 8$, $\langle k \rangle = 120$ and number of stored
patterns, $p = 4$.

For low values of $p$, i.e., $p < p_{\min}$, both $v_g$ and $v_m$
increase with $r$, eventually reaching 1 and becoming independent
of $r$ once the connectivity between the modules
becomes appreciable. We find from our numerical
results that $p_{\min} = 2$, independent of the system size
$N$ or other model parameters. This observation helps
in identifying the key mechanism for the non-monotonic
variation of $v_g$ with $r$. While at low $r$, $v_g$ is small because
the low connectivity among modules favors the chimera
states, at very large $r$ the attractors corresponding to
the stored patterns have to compete with mixed states.
Mixed states are spurious attractors that correspond to
symmetric combinations of an odd number of stored patterns
(e.g., $\xi^1 + \xi^2 + \xi^3$), which exist for all $p > 2$. This
is explicitly shown by the distribution of the overlap, $m$,
of the attractors of a network with any of the $p$ stored
patterns (shown in Fig. 3 for $p = 4$). For low values of $r$,
the dominance of chimera states results in low overlap values.
When the modules become highly inter-connected
as $r \to 1$, most randomly chosen initial strings will converge
to a stored pattern, resulting in a large peak at
$m = 1$ in the overlap distribution. However, we also notice
a smaller peak around $m \simeq 0.5$, which corresponds
to 3-pattern mixed states (which have overlap of 0.5 with
each of the three constituent stored patterns). Note that
as $r$ is gradually decreased from 1, at about $r \simeq r_c$ the $m$
distribution shows a sharp dip for overlaps around 0.5.
This corresponds to an increase in the phase space volume
occupied by the attractors of the stored patterns at
the expense of the mixed states. A similar dip in the distribution
is also observed for the corresponding overlap
around 0.5 for each module (figure not shown). Thus,
the cooperative interactions between the different modules
not only affect the recall dynamics at the global level,
but also locally within each module.
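
The statement that a 3-pattern mixed state has overlap 0.5 with each of its constituents can be checked numerically. The sketch below builds the symmetric combination $\mathrm{sign}(\xi^1 + \xi^2 + \xi^3)$ from random patterns; the system size is an arbitrary choice made here so that finite-size fluctuations are small, and no network dynamics is involved.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 10_000
xi = rng.choice((-1, 1), size=(3, N))        # three random stored patterns

# Symmetric 3-pattern mixed state; a sum of three +/-1 terms is odd, never 0.
mixed_same = np.sign(xi[0] + xi[1] + xi[2])

# Overlap with each constituent pattern: (1/N) sum_i sigma_i xi_i^mu.
overlaps_same = xi @ mixed_same / N
print(overlaps_same)
```

Each component of the mixed state agrees with a given constituent pattern with probability 3/4, so the expected overlap is $3/4 - 1/4 = 1/2$, which is the position of the secondary peak discussed above.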

Fig. 3(b) shows explicitly that the attractors not corresponding
to any of the stored patterns belong almost
exclusively to mixed states at high $r$. In principle, these
combinations can be of the same sign (e.g., $\xi^1 + \xi^2 + \xi^3$) or

FIG. 4: (a) The average convergence time $\tau$ to different attractors
in a random modular network, shown for individual
modules ($\langle \tau_m \rangle$, circles) and the entire network ($\langle \tau_g \rangle$, squares).
It is measured in terms of Monte Carlo (MC) steps required
to reach a time-invariant state starting from a random initial
configuration. (b) The difference in the average convergence
times (in MC steps) to an attractor not corresponding to a
stored pattern ($m < 0.95$) and to one of the stored patterns,
$\Delta\tau$. The difference is shown for both an individual module
($\Delta\tau_m$, circles) and the entire network ($\Delta\tau_g$, squares). The
peak close to $r_c \simeq 0.14$ corresponds to a significantly faster
convergence to the stored patterns relative to the other attractors.
Results shown for $N = 1024$, $n_m = 8$, $\langle k \rangle = 120$
and $p = 4$.



different signs (e.g., $\xi^1 - \xi^2 + \xi^3$). The curves corresponding
to each of these show that although the latter has a
higher number of possible combinations, it is the attractors
corresponding to the same-sign combinations which
occupy a larger portion of the phase space. This is a consequence
of the Hebbian learning rule, which provides a
bias for the same-sign combinations in preference to the
different-sign combinations.
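
The remark that different-sign combinations are more numerous can be illustrated by direct enumeration for $p = 4$. The counting convention used here is an assumption: a combination and its global inversion are treated as one, so only sign assignments whose first coefficient is positive are kept.

```python
from itertools import combinations, product

p = 4
same_sign, different_sign = [], []
for triple in combinations(range(1, p + 1), 3):      # which three patterns are mixed
    for signs in product((+1, -1), repeat=3):        # coefficient of each pattern
        if signs[0] < 0:
            continue   # assumed convention: skip the globally inverted copy
        target = same_sign if len(set(signs)) == 1 else different_sign
        target.append((triple, signs))

print(len(same_sign), len(different_sign))
```

Under this convention each triple of patterns contributes one same-sign and three different-sign combinations.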

So far we have discussed the long-time asymptotic 
properties of the system. The dynamical aspect repre- 
sented by the time required to reach equilibrium also 
exhibits unexpected properties. Fig. 4 shows that the
network converges faster to attractors corresponding to 
stored patterns as compared to mixed states (and other 
attractors that do not have significant overlap with any 
of the stored patterns), at both the modular and the 
network level. Moreover, this difference is slightly en- 
hanced close to r c , the modular configuration where the 
basins of the stored patterns cover the largest fraction 
of the configuration space. The non-monotonic varia- 
tion of the convergence time with decreasing modularity 
arises as a result of two competing effects: increasing r 
decreases the intra-modular connectivity, resulting in in- 
creasing time for each module to relax to an attractor; 
on the other hand, this is accompanied by an increase in 
the connections between modules, that eventually causes 
the entire system to relax faster to attractors. This dynamical
picture provides us with a possible clue as to the
enhanced global stability of the attractors corresponding
to stored patterns close to $r_c$. As there is a distinct
time-scale separation between the convergence dynamics
at the modular (or local) and at the global scale for
such networks I , the state of a specific module may
evolve to reach a sub-pattern corresponding to a part
of one of the stored patterns much faster than the network
can converge to an attractor. Once this happens,
this module biases the convergence of the other modules
connected to it (via Hebbian inter-modular links) to the
pattern to which it has converged. This increases the
likelihood of convergence of the entire network to a particular
pattern through cooperative behavior among the
modules, something that is absent when the modules are
disconnected or the network is homogeneous.
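
One way to examine the separation between modular and global time scales in a simulation is to record, for each module, the last sweep in which any of its spins changed. The sketch below does this for a single small modular network. It is an illustration under assumptions, not the measurement protocol of the text: the function name is hypothetical, a module's convergence time is defined here as the last sweep containing a flip in that module, one sweep is $N$ updates in random order, and the network parameters are chosen only to keep the example fast.

```python
import numpy as np

def convergence_times(J, module, s0, rng, max_sweeps=200):
    """Return (global time, per-module times) in sweeps, under the assumed definition."""
    s = s0.copy()
    last_flip = np.zeros(module.max() + 1, dtype=int)
    for sweep in range(1, max_sweeps + 1):
        flipped = False
        for i in rng.permutation(s.size):
            h = J[i] @ s
            new = int(np.sign(h)) if h != 0 else int(rng.choice((-1, 1)))
            if new != s[i]:
                s[i] = new
                last_flip[module[i]] = sweep     # this module was still changing
                flipped = True
        if not flipped:
            break
    return int(last_flip.max()), last_flip

# Stand-in modular network near r = 1/(n_m - 1), with Hebbian weights.
rng = np.random.default_rng(4)
N, n_m, p, k_avg = 256, 4, 3, 60
n = N // n_m
r = 1.0 / (n_m - 1)
rho_i = k_avg / ((n - 1) + r * (N - n))          # assumed closure for fixed <k>
module = np.repeat(np.arange(n_m), n)
prob = np.where(module[:, None] == module[None, :], rho_i, r * rho_i)
upper = np.triu(rng.random((N, N)) < prob, k=1)
A = (upper | upper.T).astype(int)
xi = rng.choice((-1, 1), size=(p, N))
W = xi.T @ xi / N
np.fill_diagonal(W, 0.0)

tau_g, tau_m = convergence_times(A * W, module, rng.choice((-1, 1), size=N), rng)
print("global:", tau_g, "per module:", tau_m)
```

Averaging `tau_m` and `tau_g` over many initial states, networks and values of $r$ would give curves that can be compared qualitatively with Fig. 4(a).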

In this paper we have shown that modular orga- 
nization in the connection structure of a network of 
threshold-activated elements can result in increased ro- 
bustness of dynamical attractors associated with certain 
pre-specified states. These states may represent solutions 
to computational tasks or implement memorized patterns 
of activity. The modularity of the network allows these 
states to cover the maximum volume of its phase space 
with their basins, an outcome of cooperative behavior 
between the convergence or recall dynamics in the differ- 
ent modules. Our results have special relevance to the 
question of how cognitive states arise from interactions 
between a large number of brain regions, each comprising 
many neurons. Neurobiological evidence exists that cortical
activity consists of rapid integration of signals across
spatially distinct brain regions, which occurs in a
self-organized manner through interactions
between the elements of the network of brain areas [30].
The empirical observation of modular cortical
organization and the occurrence of distinct, persistent
activity patterns corresponding to attractor dynamics
raise the intriguing possibility that evolution may have
selected modularity because of the robustness it imparts
to the underlying system. Future extensions of the work
reported here may involve considering the effect of noise,
i.e., investigating the recall dynamics at a finite temperature.
Another possibility is to investigate the role of the hierarchical
arrangement of modules that has recently been
reported in different biological systems [31, 32], including
the brain [33, 34]. Our results may also potentially be
used to understand why attractor networks with small-world
connection topology show a small increase in global
stability relative to random networks, although the local
stability of stored patterns is unaffected [35-38].